由于特征陈述学习微妙对象细节的难度,无监督的细粒度类聚类是实用但具有挑战性的任务。我们介绍C3-GaN,一种通过应用对比学习来利用Infogan的分类推理能力的方法。我们的目标是学习鼓励数据在嵌入空间中形成不同的群集边界的特征表示,同时还可以最大化潜在代码和其观察之间的相互信息。我们的方法是培训用于推断群集的鉴别器,以优化对比损失,其中最大化互信息的图像潜在的成对被认为是正对,其余部分作为负对对。具体而言,我们映射从分类分布中采样的生成器的输入,以识别鉴别器的嵌入空间,并让它们充当群集质心。以这种方式,实现了C3-GaN,以了解一个聚类友好的嵌入空间,其中每个群集都是独特的分离的。实验结果表明,C3-GaN在四个细粒度的基准数据集上实现了最先进的聚类性能,同时还减轻了模式崩溃现象。
translated by 谷歌翻译
近年来,大规模的深层模型取得了巨大的成功,但巨大的计算复杂性和大规模的存储要求使其在资源限制设备中部署它们是一个巨大的挑战。作为模型压缩和加速度方法,知识蒸馏通过从教师探测器转移黑暗知识有效提高了小型模型的性能。然而,大多数基于蒸馏的检测方法主要模仿近边界盒附近的特征,这遭受了两个限制。首先,它们忽略边界盒外面的有益特征。其次,这些方法模仿一些特征,这些特征被教师探测器被错误地被视为背景。为了解决上述问题,我们提出了一种新颖的特征性 - 丰富的评分(FRS)方法,可以选择改善蒸馏过程中的广义可检测性的重要特征。所提出的方法有效地检索边界盒外面的重要特征,并消除边界盒内的有害特征。广泛的实验表明,我们的方法在基于锚和无锚探测器上实现了出色的性能。例如,具有Reset-50的RetinAnet在Coco2017数据集上达到39.7%,甚至超过基于Reset-101的教师检测器38.9%甚至超过0.8%。
translated by 谷歌翻译
在输入图像的限制区域中工艺像素的对抗贴片攻击在物理环境中表明了它们在物理环境中的强大攻击效果。现有的认证防御对逆势补丁攻击的攻击良好,如MNIST和CIFAR-10数据集,但在图像上的更高分辨率图像上达到非常差的认证准确性。迫切需要在行业级更大的图像中针对这种实际和有害的攻击设计强大和有效的防御。在这项工作中,我们提出了认证的国防方法,以实现高分辨率图像的高可规范稳健性,并且在很大程度上提高了真正采用认证国防的实用性。我们的工作的基本洞察力是对抗性补丁打算利用局部表面的重要神经元(SIN)来操纵预测结果。因此,我们利用基于SIN的DNN压缩技术来通过减少搜索开销和过滤预测噪声的对抗区域来显着提高认证准确性。我们的实验结果表明,认证准确性从想象成数据集中的36.3%(最先进的认证检测)增加到60.4%,在很大程度上推动了实际使用的认证防御。
translated by 谷歌翻译
The 3D-aware image synthesis focuses on conserving spatial consistency besides generating high-resolution images with fine details. Recently, Neural Radiance Field (NeRF) has been introduced for synthesizing novel views with low computational cost and superior performance. While several works investigate a generative NeRF and show remarkable achievement, they cannot handle conditional and continuous feature manipulation in the generation procedure. In this work, we introduce a novel model, called Class-Continuous Conditional Generative NeRF ($\text{C}^{3}$G-NeRF), which can synthesize conditionally manipulated photorealistic 3D-consistent images by projecting conditional features to the generator and the discriminator. The proposed $\text{C}^{3}$G-NeRF is evaluated with three image datasets, AFHQ, CelebA, and Cars. As a result, our model shows strong 3D-consistency with fine details and smooth interpolation in conditional feature manipulation. For instance, $\text{C}^{3}$G-NeRF exhibits a Fr\'echet Inception Distance (FID) of 7.64 in 3D-aware face image synthesis with a $\text{128}^{2}$ resolution. Additionally, we provide FIDs of generated 3D-aware images of each class of the datasets as it is possible to synthesize class-conditional images with $\text{C}^{3}$G-NeRF.
translated by 谷歌翻译
In both terrestrial and marine ecology, physical tagging is a frequently used method to study population dynamics and behavior. However, such tagging techniques are increasingly being replaced by individual re-identification using image analysis. This paper introduces a contrastive learning-based model for identifying individuals. The model uses the first parts of the Inception v3 network, supported by a projection head, and we use contrastive learning to find similar or dissimilar image pairs from a collection of uniform photographs. We apply this technique for corkwing wrasse, Symphodus melops, an ecologically and commercially important fish species. Photos are taken during repeated catches of the same individuals from a wild population, where the intervals between individual sightings might range from a few days to several years. Our model achieves a one-shot accuracy of 0.35, a 5-shot accuracy of 0.56, and a 100-shot accuracy of 0.88, on our dataset.
translated by 谷歌翻译
Feature selection helps reduce data acquisition costs in ML, but the standard approach is to train models with static feature subsets. Here, we consider the dynamic feature selection (DFS) problem where a model sequentially queries features based on the presently available information. DFS is often addressed with reinforcement learning (RL), but we explore a simpler approach of greedily selecting features based on their conditional mutual information. This method is theoretically appealing but requires oracle access to the data distribution, so we develop a learning approach based on amortized optimization. The proposed method is shown to recover the greedy policy when trained to optimality and outperforms numerous existing feature selection methods in our experiments, thus validating it as a simple but powerful approach for this problem.
translated by 谷歌翻译
The purpose of this work was to tackle practical issues which arise when using a tendon-driven robotic manipulator with a long, passive, flexible proximal section in medical applications. A separable robot which overcomes difficulties in actuation and sterilization is introduced, in which the body containing the electronics is reusable and the remainder is disposable. A control input which resolves the redundancy in the kinematics and a physical interpretation of this redundancy are provided. The effect of a static change in the proximal section angle on bending angle error was explored under four testing conditions for a sinusoidal input. Bending angle error increased for increasing proximal section angle for all testing conditions with an average error reduction of 41.48% for retension, 4.28% for hysteresis, and 52.35% for re-tension + hysteresis compensation relative to the baseline case. Two major sources of error in tracking the bending angle were identified: time delay from hysteresis and DC offset from the proximal section angle. Examination of these error sources revealed that the simple hysteresis compensation was most effective for removing time delay and re-tension compensation for removing DC offset, which was the primary source of increasing error. The re-tension compensation was also tested for dynamic changes in the proximal section and reduced error in the final configuration of the tip by 89.14% relative to the baseline case.
translated by 谷歌翻译
According to the rapid development of drone technologies, drones are widely used in many applications including military domains. In this paper, a novel situation-aware DRL- based autonomous nonlinear drone mobility control algorithm in cyber-physical loitering munition applications. On the battlefield, the design of DRL-based autonomous control algorithm is not straightforward because real-world data gathering is generally not available. Therefore, the approach in this paper is that cyber-physical virtual environment is constructed with Unity environment. Based on the virtual cyber-physical battlefield scenarios, a DRL-based automated nonlinear drone mobility control algorithm can be designed, evaluated, and visualized. Moreover, many obstacles exist which is harmful for linear trajectory control in real-world battlefield scenarios. Thus, our proposed autonomous nonlinear drone mobility control algorithm utilizes situation-aware components those are implemented with a Raycast function in Unity virtual scenarios. Based on the gathered situation-aware information, the drone can autonomously and nonlinearly adjust its trajectory during flight. Therefore, this approach is obviously beneficial for avoiding obstacles in obstacle-deployed battlefields. Our visualization-based performance evaluation shows that the proposed algorithm is superior from the other linear mobility control algorithms.
translated by 谷歌翻译
In robotics and computer vision communities, extensive studies have been widely conducted regarding surveillance tasks, including human detection, tracking, and motion recognition with a camera. Additionally, deep learning algorithms are widely utilized in the aforementioned tasks as in other computer vision tasks. Existing public datasets are insufficient to develop learning-based methods that handle various surveillance for outdoor and extreme situations such as harsh weather and low illuminance conditions. Therefore, we introduce a new large-scale outdoor surveillance dataset named eXtremely large-scale Multi-modAl Sensor dataset (X-MAS) containing more than 500,000 image pairs and the first-person view data annotated by well-trained annotators. Moreover, a single pair contains multi-modal data (e.g. an IR image, an RGB image, a thermal image, a depth image, and a LiDAR scan). This is the first large-scale first-person view outdoor multi-modal dataset focusing on surveillance tasks to the best of our knowledge. We present an overview of the proposed dataset with statistics and present methods of exploiting our dataset with deep learning-based algorithms. The latest information on the dataset and our study are available at https://github.com/lge-robot-navi, and the dataset will be available for download through a server.
translated by 谷歌翻译
Springs are efficient in storing and returning elastic potential energy but are unable to hold the energy they store in the absence of an external load. Lockable springs use clutches to hold elastic potential energy in the absence of an external load but have not yet been widely adopted in applications, partly because clutches introduce design complexity, reduce energy efficiency, and typically do not afford high-fidelity control over the energy stored by the spring. Here, we present the design of a novel lockable compression spring that uses a small capstan clutch to passively lock a mechanical spring. The capstan clutch can lock up to 1000 N force at any arbitrary deflection, unlock the spring in less than 10 ms with a control force less than 1 % of the maximal spring force, and provide an 80 % energy storage and return efficiency (comparable to a highly efficient electric motor operated at constant nominal speed). By retaining the form factor of a regular spring while providing high-fidelity locking capability even under large spring forces, the proposed design could facilitate the development of energy-efficient spring-based actuators and robots.
translated by 谷歌翻译